Batch Load

Introduction

Smart DIH can now define batch loads through a standard pipeline interface. This means a batch load can be performed without the use of IIDR.

Configuring Batch Load: Helm

Enabling

Batch load is enabled through Kubernetes orchestration. It is not enabled by default.

The following flag has to be added to the helm command: global.batchload.enabled=true.
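As a rough sketch of where this flag goes, the function below prints a helm upgrade --install command that passes the value through --set. The release name and chart reference are hypothetical placeholders. The umbrella chart is not named here, and a given deployment may use helm install or a values file instead.

```shell
# Sketch only: prints the command for review instead of running it.
# Both arguments are placeholders (assumptions), not documented values.
batchload_enable_cmd() {
  # $1 = helm release name, $2 = chart reference (repo/chart)
  echo "helm upgrade --install $1 $2 --set global.batchload.enabled=true"
}

# Hypothetical release and chart names; substitute the ones used in your cluster.
batchload_enable_cmd my-dih my-dih-repo/my-umbrella-chart
```

Once the printed command looks right for the environment, it can be run directly in a shell that has access to the cluster.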

Adding the Agent

A separate Batch Load agent must be installed for each data source created. GigaSpaces also provides a separate helm chart for installing a Batch Load agent outside of the umbrella. This is used when a client requires more than one agent, for example, when there are multiple Oracle databases.

To install an agent under the DIH umbrella: global.batchload-agent.enabled=true

To install an agent and set its name: global.batchload-agent.agent.name=[name of agent]

It is also possible to install the Batch Load agent outside of the helm umbrella. This is used when a client needs more than one agent (for example, for multiple Oracle databases): helm install di-agent [dih repo name]/di-agents --version 2.0.0 --set agent.name=[name of agent]
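When several agents are installed this way, each one needs its own helm release, because a release name can only be used once per namespace. The sketch below prints one install command per agent and does not install anything. The repository name, the agent names, and the convention of appending the agent name to the release name are illustrative assumptions, not documented requirements.

```shell
# Sketch only: prints the commands (dry run) so they can be reviewed first.
REPO="my-dih-repo"        # placeholder standing in for [dih repo name]
CHART_VERSION="2.0.0"     # chart version used in the standalone install command

agent_install_cmd() {
  # $1 = agent name; reused in the release name so that each release is unique
  echo "helm install di-agent-$1 $REPO/di-agents --version $CHART_VERSION --set agent.name=$1"
}

# Hypothetical agent names, one per Oracle database.
for name in oracle-hr oracle-sales; do
  agent_install_cmd "$name"
done
```

Each printed line is a standalone command of the same form as the one shown above, with a different release name and agent.name value.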

Supported Data Sources and Loading Types

Currently, GigaSpaces supports full batch load from an Oracle database. More data sources and loading types will be added in future releases.

Creating a Data Source for Batch Load

Batch Load cannot be configured for a pipeline that is already configured and running with CDC (IIDR). To enable Batch Load, the appropriate configuration must be used when creating the Data Source.

To use Batch Load when creating a pipeline, add a new pipeline by following the steps outlined in the User Guide: SpaceDeck - Spaces - Adding a Pipeline for Batch Load

User Flows: Creating a Pipeline using Batch Load

Batch Load cannot be configured for a pipeline that is already configured and running with CDC (IIDR). To enable Batch Load, a new pipeline has to be created.

Oracle Database: Define Basic Full Batch Load Pipeline

  1. Log in to SpaceDeck.

  2. Define Oracle as the Data Source, with the connector type set to BATCHLOAD.

  3. Create a new pipeline.

A full batch load pipeline ends once the full load is completed, at which point its status should be Completed. This differs from a CDC pipeline, which keeps running to capture ongoing changes.